Towards K-means-friendly Spaces: Simultaneous Deep Learning and Clustering
Authors
Bo Yang, Xiao Fu, Nicholas D. Sidiropoulos, Mingyi Hong
Abstract
Most learning approaches treat dimensionality reduction (DR) and clustering separately (i.e., sequentially), but recent research has shown that optimizing the two tasks jointly can substantially improve the performance of both. The premise behind the latter genre is that the data samples are obtained via linear transformation of latent representations that are easy to cluster; but in practice, the transformation from the latent space to the data can be more complicated. In this work, we assume that this transformation is an unknown and possibly nonlinear function. To recover the ‘clustering-friendly’ latent representations and to better cluster the data, we propose a joint DR and K-means clustering approach in which DR is accomplished via learning a deep neural network (DNN). The motivation is to keep the advantages of jointly optimizing the two tasks, while exploiting the deep neural network’s ability to approximate any nonlinear function. This way, the proposed approach can work well for a broad class of generative models. Towards this end, we carefully design the DNN structure and the associated joint optimization criterion, and propose an effective and scalable algorithm to handle the formulated optimization problem. Experiments using different real datasets are employed to showcase the effectiveness of the proposed approach.

Department of Electrical and Computer Engineering, University of Minnesota, Minneapolis, MN 55455, USA. Department of Industrial and Manufacturing Systems Engineering, Iowa State University, Ames, IA 50011, USA. Correspondence to: Bo Yang, Xiao Fu, Nicholas D. Sidiropoulos, Mingyi Hong. Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia, PMLR 70, 2017. Copyright 2017 by the author(s).
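To make the idea of a joint DR and K-means criterion concrete, here is a minimal PyTorch sketch of a reconstruction-plus-clustering loss of the kind the abstract describes. It is an illustration only, not the authors' implementation: the layer sizes, the weight lam, and the treatment of the centroids as ordinary learnable parameters are assumptions (the paper proposes its own dedicated, scalable algorithm for the formulated problem).

```python
# Illustrative sketch (not the authors' code) of a joint objective of the form
#   L = sum_i ||g(f(x_i)) - x_i||^2 + lam * ||f(x_i) - m_{s_i}||^2,
# where f is a DNN encoder, g a decoder, m_1..m_K are centroids, and s_i is
# the index of the centroid assigned to sample i.
import torch
import torch.nn as nn

class JointDRKMeans(nn.Module):
    def __init__(self, in_dim=100, latent_dim=10, k=5):
        super().__init__()
        # encoder f(.) and decoder g(.); the layer sizes are arbitrary choices
        self.encoder = nn.Sequential(nn.Linear(in_dim, 50), nn.ReLU(),
                                     nn.Linear(50, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 50), nn.ReLU(),
                                     nn.Linear(50, in_dim))
        # cluster centroids, kept as learnable parameters for simplicity
        self.centroids = nn.Parameter(torch.randn(k, latent_dim))

    def loss(self, x, lam=0.5):
        z = self.encoder(x)                      # latent representations f(x)
        x_hat = self.decoder(z)                  # reconstructions g(f(x))
        assign = torch.cdist(z, self.centroids).argmin(dim=1)  # nearest centroid
        recon = ((x_hat - x) ** 2).sum(dim=1).mean()
        kmeans = ((z - self.centroids[assign]) ** 2).sum(dim=1).mean()
        return recon + lam * kmeans              # joint DR + K-means criterion
```

Minimizing such a loss pulls the latent representations toward their assigned centroids while retaining enough information to reconstruct the data, which is the "clustering-friendly" behavior the abstract refers to.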
Similar papers
Supplementary material of “Towards K-means-friendly Spaces: Simultaneous Deep Learning and Clustering”
where σ(·) is the sigmoid function as before and W ∈ R^{100×2} is generated similarly as in the paper. We perform elementwise squaring on the resulting features to further complicate the generating process. The corresponding results can be seen in Fig. 1 of this supplementary document. One can see that the pattern observed in the main text also appears here: the proposed DCN rec...
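For reference, the generating process described in this snippet (a sigmoid applied to a random 100×2 linear map, followed by elementwise squaring) can be sketched as below. How the 2-D latent points and their clusters are drawn here is an assumption for illustration, since the snippet defers those details to the paper.

```python
# Sketch of the generative process described above: x = (sigmoid(W h))^2,
# with W in R^{100x2} and h a 2-D latent point. The way the latent clusters
# are sampled below is illustrative only.
import numpy as np

rng = np.random.default_rng(0)

def generate(n_per_cluster=100, k=4, latent_dim=2, data_dim=100):
    W = rng.standard_normal((data_dim, latent_dim))        # random linear map W
    centers = rng.uniform(-5.0, 5.0, size=(k, latent_dim)) # latent cluster centers
    H = np.vstack([c + 0.5 * rng.standard_normal((n_per_cluster, latent_dim))
                   for c in centers])                      # clustered latent points
    X = 1.0 / (1.0 + np.exp(-H @ W.T))                     # sigmoid(W h), row-wise
    return X ** 2, np.repeat(np.arange(k), n_per_cluster)  # elementwise squaring
```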
Learning Deep Parsimonious Representations
In this paper we aim at facilitating generalization for deep networks while supporting interpretability of the learned representations. Towards this goal, we propose a clustering-based regularization that encourages parsimonious representations. Our k-means-style objective is easy to optimize and flexible, supporting various forms of clustering, such as sample clustering, spatial clustering, as...
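As a rough illustration of what a k-means-style regularizer on learned representations can look like (a sketch under assumptions, not this paper's exact objective), one can penalize the distance of each representation to its nearest cluster center and add that penalty to the task loss:

```python
# Hypothetical helper: a k-means-style penalty on a layer's representations.
# The name and interface are made up for this sketch, not taken from the paper.
import torch

def kmeans_regularizer(reps: torch.Tensor, centers: torch.Tensor) -> torch.Tensor:
    """reps: (batch, d) activations; centers: (k, d) cluster centers."""
    dists = torch.cdist(reps, centers)      # pairwise distances, shape (batch, k)
    nearest = dists.min(dim=1).values       # distance to the assigned center
    return (nearest ** 2).mean()            # average squared distance

# Usage: total_loss = task_loss + lam * kmeans_regularizer(features, centers)
```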
Improving Accuracy in Intrusion Detection Systems Using Classifier Ensemble and Clustering
Recently, with the development of technology, the number of network-based services has been increasing, and sensitive user information is shared through the Internet. Accordingly, large-scale malicious attacks on computer networks could cause severe disruption to network services, so cybersecurity has become a major concern for networks. An intrusion detection system (IDS) could be cons...
Towards Explaining the Speed of k-Means
The problem of clustering data into classes is ubiquitous in computer science, with applications ranging from computational biology and machine learning to image analysis. The k-means method is a very simple and implementation-friendly local improvement heuristic for clustering. It is used to partition a set X of n d-dimensional data points into k clusters. (The number k of clusters is fixed i...
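For readers unfamiliar with it, the local improvement heuristic mentioned here is Lloyd's algorithm: alternate between assigning every point to its nearest center and recomputing each center as the mean of its assigned points. A minimal NumPy sketch follows; the initialization and stopping rule are arbitrary choices, not details from this paper.

```python
# Minimal k-means (Lloyd's) heuristic: partition n d-dimensional points into k clusters.
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]    # random initial centers
    for _ in range(n_iter):
        # assignment step: nearest center for every point
        labels = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1).argmin(1)
        # update step: mean of each cluster (keep the old center if a cluster empties)
        new_centers = np.array([X[labels == j].mean(0) if np.any(labels == j)
                                else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):                  # converged
            break
        centers = new_centers
    return labels, centers
```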
Simple Deep Random Model Ensemble
Representation learning and unsupervised learning are two central topics of machine learning and signal processing. Deep learning is one of the most effective unsupervised representation learning approaches. The main contributions of this paper to these topics are as follows. (i) We propose to view representative deep learning approaches as special cases of the knowledge reuse framework of clus...